Abstract

We study the worst-case communication complexity 
of distributed algorithms computing a path problem 
based on stationary distributions of random walks in 
a network G with the caveat that G is also the com- 
munication network. The problem is a natural gen- 
eralization of shortest path lengths to expected path 
lengths, and represents a model used in many prac- 
tical applications such as pagerank and eigentrust as 
well as other problems involving Markov chains de- 
fined by networks. 

For the problem of computing a single stationary
probability, we prove an $\Omega(n^2 \log n)$ bits lower
bound; the trivial centralized algorithm costs $O(n^3)$
bits and no known algorithm beats this. We also
prove lower bounds for the related problems of ap- 
proximately computing the stationary probabilities, 
computing only the ranking of the nodes, and com- 
puting the node with maximal rank. As a corollary, 
we obtain lower bounds for labelling schemes for the 
hitting time between two nodes. 

1 Introduction 

Let G be a strongly-connected, directed, unweighted 
graph on n vertices. G defines a communication net- 
work, where nodes are processors and communication 
can only occur along edges. G also defines a random 
walk process: a token walks over the nodes (states)
of G, at each time step choosing the next node
uniformly at random from the outgoing edges of the
current node. We shall refer to this stochastic process
as the harmonic random walk on G. 

A basic path problem in distributed computing is 
as follows. Given a network G where each node only 
knows its neighbours, compute the lengths d(v, u) of 
the shortest path in G from each node v to some 
fixed node u. We consider the natural generalization 
of this problem to expected path lengths using the 
harmonic random walk on G. Let E[d(u, v)] be the 
expected length of the walk that begins at node u 
and terminates on first hitting node v. We shall be 
interested in the values E[d(u, u)], i.e. the expected 
return times of the token. If the Markov chain defined
by G is irreducible then these values exist and the
random walk has a unique stationary distribution $\pi =
(\pi(1), \pi(2), \dots, \pi(n))$, where $\pi(u) = 1/E[d(u,u)]$ is
known as the stationary probability of node u and
$\sum_u \pi(u) = 1$.

If G is undirected then the stationary probability 
of any particular vertex is proportional to its degree - 
regardless of the structure of G. This remarkable fact 
is the key to numerous applications involving Markov 
chains. Crucially however, the networks we consider
are directed and so in general, no such simple closed-form
expression exists for $\pi$.
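
To make these definitions concrete, the following is a minimal Python sketch (not part of the original analysis) that computes the stationary distribution of the harmonic random walk exactly, by solving the balance equations over the rationals. The function name `stationary`, the adjacency-list representation and the two toy graphs are assumptions made purely for illustration.

```python
from fractions import Fraction

def stationary(adj):
    """Exact stationary distribution of the harmonic random walk.

    adj maps each node to the list of its out-neighbours."""
    nodes = sorted(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Row u encodes pi(u) - sum_v pi(v)/outdeg(v) = 0; last column is the RHS.
    M = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for v in nodes:
        M[idx[v]][idx[v]] += 1
        for u in adj[v]:
            M[idx[u]][idx[v]] -= Fraction(1, len(adj[v]))
    # One balance equation is redundant; replace it by sum(pi) = 1.
    M[n - 1] = [Fraction(1)] * (n + 1)
    # Gauss-Jordan elimination over the rationals.
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return {v: M[idx[v]][n] for v in nodes}

# Directed example: a -> b, a -> c, b -> c, c -> a.
directed = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
pi_directed = stationary(directed)

# Undirected path a - b - c, written as symmetric directed edges.
undirected = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
pi_undirected = stationary(undirected)
```

In the undirected example the result is proportional to the degrees (1, 2, 1), whereas in the directed example nodes a and c receive the same probability even though their out-degrees differ, so no degree-based shortcut applies.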

Random walks have been studied extensively and
have numerous applications in distributed computing,
including self-stabilizing networks and token management
[1]. In the last decade there has been substantial
interest in using random walks to construct ranking
schemes in networks, the most obvious being the
pagerank scheme used by Google to rank web pages
[2].

In this paper we consider distributed algorithms 
that compute properties of the harmonic random 
walk on some network G, with the caveat that the 
algorithms must use G both as a communication net- 
work and as an input, where each node initially knows 
only its local edges. 

Fix some node u. We say a distributed algorithm 
computes a value x iff it terminates with at least one 
node knowing x. The communication complexity of a 
distributed algorithm is the total number of bits sent, 
over all edges. The communication complexity of 
computing a value x is the minimum communication 
complexity of any distributed algorithm that com- 
putes x. Our main aim is to show good lower bounds 
on the communication complexity of distributed algorithms
computing $\pi(u)$, and some related problems.
Note that, since the harmonic random walk uses only
rational probabilities (the reciprocal of the outdegree
of a node), the stationary probabilities are also rational
and can be computed using finite precision. Our
problem, then, is not how to efficiently approximate 
a real number, but how to understand its inherent 
complexity based on the network topology that gen- 
erates it. 

Our main result is a series of lower bounds that 
suggest that, in the worst-case, one can do better 
than the trivial centralized algorithm, and in some 
cases randomization may be of help in reducing the 
communication complexity. 

1.1 Technique 

Our main technique for proving lower bounds is to
consider a related two-party communication complexity
problem: partition the nodes of G into $G_1, G_2$
and let Alice know $G_1$ and Bob know $G_2$, and choose
some node $u \in G$. We first lower bound the number
of bits Alice and Bob must exchange in order
to compute $\pi(u)$ in the worst case (see [3] for examples
of this technique). Then we lift the result to the
distributed case by replacing the cut $(G_1, G_2)$ by a
linear array, 'stretching' each edge of the cut by as
many edges as possible while maintaining $O(n)$ nodes
in the network. Because of this, many of our worst-case
instances resemble the 'barbell graph'. To do
this we appeal to the linear array conjecture [4, 5] as
follows.

If we can show that there exists a class of graphs
having a cut where at least $\Omega(n)$ bits must be communicated
across this cut, and if the cut is sparse
(contains $O(1)$ edges), we can replace it by a line of
n edges. The question 'does this increase the communication
complexity by a factor n?' is the linear
array conjecture [4]. The answer is that the randomized
communication complexity increases by a factor
kn, where k is some constant less than 1. In other
words, each of these edges must see $\Omega(n)$ bits.

We shall prove all our two-party lower bounds by 
reduction from two main known problems. For the 
purely information-theoretic bounds, we reduce from 
set-disjointness: Alice and Bob each have a subset
P, Q of $\{1, \dots, m\}$ and they must decide whether
$P \cap Q = \emptyset$. The randomized communication complexity
of disjointness is $\Omega(m)$ bits, for any protocol that
decides with error probability less than 1/3. Some
of our results give bounds much stronger than the
information-theoretic results, but only for deterministic
algorithms. For these results we use the problem
greater-than: Alice and Bob each have an m-bit
number P, Q and they must decide whether P > Q.
Any deterministic protocol must have communication
complexity $\Omega(m)$ bits, yet any randomized protocol
must communicate at least $\Omega(\log m)$ bits. Other results
and proofs in communication complexity can be
found in the book [3].

Our technique resembles that of Tiwari [6] , where 
the network is simulated on a linear array and then 
one can use a reduction from a known bound on the 
complexity of the function when computed on a linear 
array of processors. Our work differs from Tiwari's 
however, since we require that the algorithm must use the
network both for communication between processors 
and as the input. For this reason we cannot consider 
the function to be computed and the network that it 
is to be computed on as two separate problems. In 
particular, there appears to be a tradeoff between the 
strength of a two-party lower bound and our ability to
'lift' it onto a larger network to obtain good bounds
in the distributed case. In this sense, our problem
is similar in spirit to that of Hańćkowiak et al. [7],
who consider the problem of distributedly computing
maximal matchings of a network (although they
consider upper bounds on the time complexity rather
than communication).

We believe that strengthening our lower bounds 
requires a different technique. This is because we 
believe that the two-party lower bounds are tight if 
the cut is sparse, but the lifting onto a linear array
only amplifies the result by a factor of $O(n)$. In
other words, the two-party bound is replicated across
at most a linear number of edges, and yet there are
potentially $O(n^2)$ edges in such a network of low expansion
and high diameter.

1.2 Related work 

Aside from being an interesting problem in its own 
right, the problem is an abstract model underly- 
ing several basic problems in distributed computing, 
for example randomized routing, self-stabilization, 
network flow [5] and load balancing (where a node 
offloads work to its neighbours in proportion to 
their difference in current workloads). Although dis- 
tributed and parallel algorithms have been used to 
solve Markov chains simulating large systems, for ex- 
ample queuing or communication networks, our set- 
ting is different - the Markov chain that we wish to 
sample from is defined precisely by the communica- 
tion network on which it needs to be computed. 

The problem also underlies the pagerank algorithm
[2] for web page ranking. Here, G represents the web
graph where each node is a web page and an edge represents
a hyperlink between two pages. The pagerank
of G is defined as the stationary distribution of the
harmonic random walk on G', where G' is obtained
by adding a 'reset' transition from every node to every
node in some root set S. In a problem of this
scale, distributed computing in the network could
be extremely valuable¹; it is thus useful to understand
the communication requirements of algorithms
for this problem. By adding the reset transitions to
a large enough set S, it is possible to show that the
harmonic walk on G' is rapidly-mixing, and hence
any iterative algorithm for computing the pageranks
will converge quickly, typically in $O(\log n)$ iterations.
As the results in this paper are primarily worst-case
results, they are unlikely to be tight for this particular
problem. However, we feel that it is important
to understand the worst-case complexity of the more
general problem as we define it.

¹ For pagerank, the nodes do not map directly to network
nodes, as there are usually many pages on a single site. However,
even if we collapse all pages on a site to a single page
and abstract to the level of network nodes, the problem is still
sufficiently large that communication bottlenecks could be
significant.

The desirable properties of Markov chains (for
example, based on the known stability of principal
eigenvectors under small perturbations) have
led to them finding new applications in distributed
web searching [9, 10], distributed 'reputation' systems
[11] and many other problems that can be expressed
as finding the stationary distribution of a
Markov chain on some network. Although several
distributed algorithms have been proposed for pagerank
[HI [TUl [HI El EE 03], to the best of our knowledge
nothing nontrivial is known about the communication
complexity of the problem, nor of these algorithms.

In trying to model more faithfully browsing behaviour
on the world-wide web, Fagin et al. [13] introduce
backoff processes. Such a process can be defined
by a graph G (and its harmonic random walk),
and for each node, a backoff probability p, where at
each time step with probability $1 - p$, the token moves
forward as defined by the walk, and with probability
p it returns to its previous state. They show some
interesting phenomena that are induced by this process,
for example it does not always have a limit distribution
independent of the starting state, even if
the underlying chain is ergodic. It would seem interesting
to extend our results to obtain lower bounds
for the complexity of these processes.

Fogaras and Rácz [15] consider a related problem
known as 'personalized pagerank'. The personalized
pagerank for a node u is defined as the unique stationary
vector

$$\pi(u) = (1 - c)\,P\,\pi(u) + c\,\mathbf{u}$$

where $\mathbf{u}_u = 1$ and $\mathbf{u}_v = 0$ for $v \neq u$. They prove
simple lower bounds on the size of the database required
for a centralized server to be able to answer
queries about $\pi(u)$ (for all nodes u), in both exact
and approximate models. Like our results, their
lower bounds utilise communication complexity ar- 
guments, but they only require reductions from one- 
way communication complexity rather than the two- 
way results we require to lower bound arbitrary dis- 
tributed computations. Because of this, their results 
are purely information-theoretic, yet for some of our 
problems we are able to give much stronger results 
than the information-theoretic bound, by showing 
how we can use the network to 'do work' for us. 

As far as we know, we are the first to consider
communication complexity lower bounds for problems
where the network structure itself forms the input,
and the output depends nontrivially on its structure.
There has, of course, been progress with proving
lower bounds for problems in distributed computation.
Abelson [16] obtained the first nontrivial
lower bounds for a distributed protocol to solve a system
of linear equations, although his result applies to
differentiable real-valued functions. The lower bound
is based on showing that the matrix that describes
the system has sufficiently high rank that any protocol
must make a large number of choices to locate
the solution. This is related to the well-known 'fooling
set' lower bound technique, now a staple part of
communication complexity. However, Abelson's result
gives a lower bound on the number of values that
must be communicated whereas we give information- 
theoretic results on the number of bits that must be 
communicated over a network which also forms part 
of the input. 

1.3 Summary of our results 

In Section 2 we consider the case where the network
G is undirected and unweighted. The harmonic ran- 
dom walk induced is then that of a reversible chain. 
For these graphs we show that there is a simple op- 
timal algorithm to compute the stationary distribu- 
tion. This result shows that reversible chains can 
only encode local information about the graph into 
the stationary probabilities. 



Next we consider the case where G is directed. Our
results show that, unlike the undirected case, interesting
structural properties can be encoded into the
stationary distribution, and understanding the nature
of this is our main tool in obtaining good communication
complexity lower bounds. In Section 3 we
prove a lower bound on the total communication required
for any distributed algorithm to compute the
stationary probability of a single node in the graph. 
The motivation for this result is that, in a large dis- 
tributed network, it is not efficient to compute the 
values for all nodes if only one node requires its value. 
As we show, even though the stationary probability 
of a single node depends on the stationary probabil- 
ity of all other nodes, our results suggest that we can 
compute a single probability from scratch, at a lower 
cost than computing all the values. 



We then consider variants of the basic problem. In
Section 4 we look at computing the entire stationary
distribution and prove that, asymptotically, there
are the same number of distinct principal eigenvectors
as there are unweighted graphs. In Section 5
we turn to the problem of approximately computing
stationary probabilities. Currently we know of no
distributed approximation algorithm that achieves a
specified approximation factor, yet this appears to be
a useful practical problem. In Section 6 we consider
an interesting variant of the problem: computing the
rank of a single node in the stationary distribution
$\pi$. We prove a communication lower bound for computing
the node with maximal rank, and whether a
node has even or odd rank (which implies a bound
on computing the actual rank).



Finally, in comparing our results to those for com- 
puting shortest path lengths, we use the elegant path 
algebra framework of Gondran and Minoux [17] to 
formalize the problems, and discuss the complexity 
results in terms of algebraic properties of the problem.
This appears to be a novel approach to accounting
for the complexity differences.

2 Bounds for reversible chains

A Markov chain is reversible iff it satisfies the detailed
balance equations

$$\pi(u)\,p_{uv} = \pi(v)\,p_{vu}$$

i.e. the probability flux between any two nodes is
the same in both directions. In particular, if G is
undirected then the harmonic random walk on G is
a reversible Markov chain. Reversible chains have
many remarkable properties. In particular, the key
to efficient computation on reversible chains is the
following: the stationary probability of any node is
proportional to its degree, regardless of the structure
of G. More precisely, if G is undirected then using
the detailed balance equations, it is easy to verify
that $\pi(u) = \mathrm{degree}(u)/2|E|$ is indeed a stationary
probability, and if G is ergodic then this is unique.
Hence the stationary probability $\pi(u)$ is determined
solely by $|E|$ and the degree of u. If G is not reversible
then in general there is no such local expression for
computing $\pi(u)$.
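
The verification mentioned above is short enough to spell out. As a sketch, assuming G is undirected with edge set E and that the harmonic walk moves from u to each neighbour v with probability $1/\mathrm{degree}(u)$, the candidate satisfies detailed balance on every edge $\{u, v\}$, summing detailed balance over u gives stationarity, and the handshake identity gives normalization:

$$
\begin{aligned}
\pi(u)\,p_{uv} &= \frac{\mathrm{degree}(u)}{2|E|}\cdot\frac{1}{\mathrm{degree}(u)} = \frac{1}{2|E|} = \frac{\mathrm{degree}(v)}{2|E|}\cdot\frac{1}{\mathrm{degree}(v)} = \pi(v)\,p_{vu},\\
\sum_u \pi(u)\,p_{uv} &= \sum_u \pi(v)\,p_{vu} = \pi(v)\sum_u p_{vu} = \pi(v),\\
\sum_u \pi(u) &= \frac{1}{2|E|}\sum_u \mathrm{degree}(u) = \frac{2|E|}{2|E|} = 1.
\end{aligned}
$$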

A simple algorithm for computing $\pi(u)$ is then as
follows: given a spanning tree T of G, each node
computes the sum of the degrees of the nodes below
it in the tree in a depth-first manner. There are $n - 1$
edges in T and each edge carries at most $O(\log n^2) =
O(\log n)$ bits, hence this algorithm sends $O(n \log n)$
bits in total in the worst case. The following simple
theorem shows that this is asymptotically optimal.
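
The degree-summing algorithm can be simulated centrally to see where the $O(n \log n)$ count comes from. The sketch below is an illustration only: the name `convergecast_degree_sum` is hypothetical, the spanning tree is built by a breadth-first search purely for convenience (the text assumes the tree is given), and each message is charged the bit length of the partial sum it carries.

```python
from collections import deque
from fractions import Fraction

def convergecast_degree_sum(adj, root):
    """Sum all degrees up a spanning tree; return (2|E|, bits sent)."""
    # Build a spanning tree by BFS (assumption: any spanning tree would do).
    parent = {root: None}
    order = []
    queue = deque([root])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    # Each node starts with its own degree and forwards its subtree total
    # to its parent; children always appear after parents in BFS order.
    subtotal = {v: len(adj[v]) for v in adj}
    bits_sent = 0
    for v in reversed(order):
        p = parent[v]
        if p is not None:
            bits_sent += max(1, subtotal[v].bit_length())
            subtotal[p] += subtotal[v]
    return subtotal[root], bits_sent

# Undirected path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
two_E, bits = convergecast_degree_sum(path, "a")
pi_a = Fraction(len(path["a"]), two_E)  # degree(a) / 2|E|
```

Each of the $n - 1$ tree edges carries one number bounded by $2|E| \le n^2$, which is the source of the $O(\log n)$ bits per edge in the analysis above.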

Theorem 1 Any distributed algorithm that computes
$\pi$ for an n-node reversible chain must communicate
$\Omega(\log n)$ bits over $\Omega(n)$ edges in the worst case.

Proof. We will show an information-theoretic bound
on the communication required between two sides of a
cut in a sufficiently large class of graphs. Consider an
undirected graph G with n nodes $u_1, \dots, u_n$, and add
an extra node v with an edge $(u_n, v)$. We will consider
the amount of communication required between $u_n$
and v, i.e. across the edge $(u_n, v)$.

Since the degree of v is fixed and the harmonic
walk on G gives a reversible chain, only the number of
edges between the $u_i$ affects the stationary probability
of v. Since there are $\frac{1}{2}\left(\binom{n}{2} - n\right) = \Omega(n^2)$
strongly-connected graphs with a distinct number of edges,
this gives $\Omega(n^2)$ different possible values of $\pi(v)$. If
each one is equally likely then at least $\Omega(\log(n^2))$ bits
must cross the edge $(u_n, v)$.

Now imagine replacing the edge $(u_n, v)$ by a linear
array of n edges. As described in Section 1.1 we can
lift our lower bound onto the linear array with an
increase by a factor kn, for some constant k. Hence
at least $\Omega(\log n)$ bits must flow over at least $\Omega(n)$
edges in the worst case. □

It seems that the requirement of reversibility pre- 
cludes the existence of an interesting relationship be- 
tween the structure of the graph and the stationary 
distribution of the harmonic walk on it. 

3 Directed Markov chains 

In the remainder of the paper, we consider directed 
graphs. The Markov chain may not have a station- 
ary distribution and certainly does not have a simple 
closed form, as for undirected graphs. 

A Markov chain is said to be irreducible if for all
u, v there is a positive probability of the token reaching
u from v. Assume that the chain is irreducible. A
fundamental result is that there exists a unique stationary
distribution $\pi = (\pi(1), \pi(2), \dots, \pi(n))$ with
$\sum_u \pi(u) = 1$ that satisfies the balance equations

$$\pi(u) = \sum_{v : (v,u) \in E} \frac{1}{\mathrm{outdegree}(v)}\,\pi(v).$$

We shall therefore assume that G is strongly-connected
and non-bipartite, as this will guarantee
the existence of a stationary distribution.

Let $p_{uv} = 1/\mathrm{outdegree}(u)$ be the transition probability
from state u to v, and $p^{(k)}_{uv}$ be the probability
of the token being at v after exactly k steps, starting
from u. A state u is recurrent if it is visited
infinitely often in an infinitely long walk, and aperiodic
if $\gcd\{k : p^{(k)}_{uu} > 0\} = 1$. Recurrent, aperiodic
states are said to be ergodic. An irreducible
chain whose states are ergodic is said to be ergodic.
If the chain is ergodic then in addition, the limit
$\lim_{k \to \infty} p^{(k)}_{ij} = \pi(j)$ exists and is independent of i.
This forms the basis for iterative algorithms that
compute $\pi$. Since we want to lower bound the communication
of any algorithm that computes $\pi$, we
shall not require ergodicity, but only that the stationary
distribution $\pi$ exists.
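
The distinction between ergodicity and mere existence of a stationary distribution can be seen in a few lines of Python. The sketch below is illustrative only (the helper names `step` and `iterate` and both toy graphs are assumptions): it pushes a point mass through the k-step transition probabilities in floating point, once on an aperiodic chain and once on a directed 2-cycle, which is periodic.

```python
def step(dist, adj):
    """One step of the harmonic walk applied to a probability vector."""
    nxt = {v: 0.0 for v in adj}
    for u, mass in dist.items():
        share = mass / len(adj[u])  # uniform over outgoing edges
        for v in adj[u]:
            nxt[v] += share
    return nxt

def iterate(adj, start, steps):
    """Distribution of the token after `steps` steps, starting at `start`."""
    dist = {v: 0.0 for v in adj}
    dist[start] = 1.0
    for _ in range(steps):
        dist = step(dist, adj)
    return dist

# Aperiodic: cycles of length 2 (a -> c -> a) and 3 (a -> b -> c -> a).
aperiodic = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
limit = iterate(aperiodic, "a", 200)

# Periodic: the token alternates between a and b forever.
periodic = {"a": ["b"], "b": ["a"]}
even = iterate(periodic, "a", 200)
odd = iterate(periodic, "a", 201)
```

On the aperiodic chain the iterates settle near (2/5, 1/5, 2/5). On the 2-cycle they oscillate, even though the balance equations still have the unique solution (1/2, 1/2); this is the situation in which $\pi$ exists but the iterative method does not find it.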

3.1 Lower bound 

We now show an information-theoretic lower bound
on the communication complexity of computing the
stationary probability of a single node. Assume that
the graph G = (V, E) is unweighted (hence $w_{uv} = 1$
iff $(u, v) \in E$ and 0 otherwise) and directed, and has
n nodes.

Theorem 2 Any distributed algorithm, randomized
or deterministic, that computes $\pi(u)$ for some u must
communicate at least $\Omega(n \log n)$ bits over $\Omega(n)$ edges
in the worst case.

We will first prove an $\Omega(n)$ bound for sparse graphs
then show how it can be improved to $\Omega(n \log n)$ in
the case of dense graphs.

The following two lemmas establish lower bounds
on the communication complexity of a two-party
version of the problem by reduction from set-disjointness.
We show that, without any communication,
Alice and Bob can construct a graph G (where
the edges are partitioned between Alice and Bob),
such that knowing the stationary probability $\pi(u)$ for
some node u allows them to solve disjointness.

Lemma 1 The randomized communication complexity
is $\Omega(n)$ bits in the case of sparse graphs.

Proof. Given two sets $P, Q \subseteq \{1, \dots, n\}$,
Alice and Bob construct a sparse graph G as follows.
Because the stationary value of u will reveal the entire
set P, Bob need not encode anything, and so his part
of the graph is constant.

The graph contains n nodes $x_1, x_2, \dots, x_n$ on a cycle
$x_1 \to x_2 \to \cdots \to x_n \to x_1$, and nodes u and
u'. There is a sink node $x_a$, with edges $x_a \to x_1$ and
$\{u, u'\} \to x_a$. Finally, add two nodes v and v' with
edges $u \leftrightarrow v$ and $u' \leftrightarrow v'$. The cut of the graph shall
be $(\{v, v'\}, G - \{v, v'\})$. For each element $j \in P$, add an
edge $x_j \to u$ and for each element $j \notin P$, add an
edge $x_j \to u'$. The construction is illustrated in
Figure 1.

Figure 1: Construction for Lemma 1 when n = 4 and
P = {2,3}.

Intuitively, as the stationary probabilities
halve on each step in the cycle, the binary expansion
of $\pi(v)$ should reveal P. We shall use $\pi(j)$ to denote
the stationary probability of $x_j$, and similarly $\pi(a)$
for $x_a$. Now, the flux drops exponentially along the
cycle so we have

$$\pi(j) = 2^{n-j}\,\pi(n).$$

Consider the stationary probability $\pi(v)$:

$$\pi(v) = \frac{1}{2}\pi(u) = \frac{1}{2}\sum_{j \in P} \pi(j) = \pi(n) \sum_{j \in P} 2^{n-j-1}.$$

For node v to be able to obtain P from the binary
expansion of $\pi(v)$, the value $\pi(n)$ must be a constant,
independent of the set P. Note that this does not
follow trivially; for example, if we replaced the edge
between u' and v' with a self-loop at v' (or even with
nothing), then $\pi(n)$ would depend on P. The reason
for this is that there must be a flow of constant flux
(independent of P) from the nodes on the cycle, and
back to $x_a$, and hence $\pi(u) + \pi(u')$ must also be a
constant. We now show that for our construction,
this is indeed the case. For simplicity, let $s = \pi(u) +
\pi(u')$ and $t = \pi(v) + \pi(v')$. Then

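$$
\begin{aligned}
\frac{s}{2} = \pi(a) &= \frac{1}{2}\bigl(\pi(u) + \pi(u')\bigr)\\
&= \frac{1}{4}\,\pi(n) \sum_{j=1}^{n} 2^{n-j} + \frac{1}{2}\,t\\
&= \frac{1}{4}\,\pi(n)(2^n - 1) + \frac{1}{2}\,t
\end{aligned}
$$

and

$$
\begin{aligned}
t = \frac{1}{2}\bigl(\pi(u) + \pi(u')\bigr) = \frac{1}{2}\,s &= \frac{1}{4}(2^n - 1)\,\pi(n) + \frac{1}{2}\,t\\
t &= \frac{1}{2}(2^n - 1)\,\pi(n).
\end{aligned}
$$

Since $\pi$ is a probability distribution, it must sum to
unity:

$$
\begin{aligned}
\pi(a) + s + t + \pi(n)(2^n - 1) &= 1\\
\frac{3}{2}\,s + t + \pi(n)(2^n - 1) &= 1\\
\frac{7}{4}(2^n - 1)\,\pi(n) + \frac{5}{2}\,t &= 1\\
3(2^n - 1)\,\pi(n) &= 1\\
\pi(n) &= \frac{1}{3(2^n - 1)}.
\end{aligned}
$$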



Hence $\pi(v) = c \sum_{j \in P} 2^{n-j-1}$ for some constant c.
Now, suppose node v computed $\pi(v)$ in this graph.
We can assume that it knows c (as it depends only
on n). Then it can read off the n-bit set P from
the binary expansion of $\pi(v)$ (note that the largest
element of P is represented by the least significant
bit of $\pi(v)$). Since the randomized communication
complexity of set-disjointness is $\Omega(n)$ bits, at least
this many bits must cross the cut between Alice and
Bob. □
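
The construction in Lemma 1 is small enough to check mechanically. The following Python sketch is an illustration rather than part of the proof: it assumes the edge set described above (in particular the edges $x_a \to x_1$ and $x_n \to x_1$), uses hypothetical helper names (`stationary`, `lemma1_graph`, `decode`, `roundtrip`), and solves the balance equations exactly over the rationals before reading P back out of $\pi(v)$.

```python
from fractions import Fraction

def stationary(adj):
    """Exact stationary distribution of the harmonic walk (adjacency lists)."""
    nodes = sorted(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Row u encodes pi(u) - sum_v pi(v)/outdeg(v) = 0; last column is the RHS.
    M = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for v in nodes:
        M[idx[v]][idx[v]] += 1
        for u in adj[v]:
            M[idx[u]][idx[v]] -= Fraction(1, len(adj[v]))
    M[n - 1] = [Fraction(1)] * (n + 1)  # replace a redundant row by sum(pi) = 1
    for col in range(n):  # Gauss-Jordan elimination over the rationals
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return {v: M[idx[v]][n] for v in nodes}

def lemma1_graph(n, P):
    """Graph encoding a set P of indices in 1..n (assumes n >= 2)."""
    adj = {"xa": ["x1"], "u": ["xa", "v"], "u'": ["xa", "v'"],
           "v": ["u"], "v'": ["u'"]}
    for j in range(1, n + 1):
        successor = f"x{j % n + 1}"  # x_n links back to x_1
        adj[f"x{j}"] = [successor, "u" if j in P else "u'"]
    return adj

def decode(n, pi_v):
    """Recover P from pi(v), using the constant pi(n) = 1/(3(2^n - 1))."""
    c = Fraction(1, 3 * (2**n - 1))
    code = 2 * pi_v / c  # equals the sum over j in P of 2^(n-j)
    assert code.denominator == 1
    return {j for j in range(1, n + 1) if (int(code) >> (n - j)) & 1}

def roundtrip(n, P):
    """Return (pi(n), decoded set) for the graph built from P."""
    pi = stationary(lemma1_graph(n, P))
    return pi[f"x{n}"], decode(n, pi["v"])

pi_n, recovered = roundtrip(4, {2, 3})  # the instance shown in Figure 1
```

For the Figure 1 instance this gives $\pi(n) = 1/45$ and recovers P = {2, 3}; trying other sets P leaves $\pi(n)$ unchanged, which is the constancy the proof relies on.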



More precisely, the construction defines a class of
sparse graphs where any algorithm that computes
$\pi(v)$ with probability of error at most p allows some
node to learn an n-bit set with the same probability
of error.

Now we show how the two-party lower bound can
be improved to $\Omega(n \log n)$ bits in the case of dense
graphs. The idea is that, instead of each node on the
cycle encoding a single bit, each node can encode a
small set of size $O(\log n)$ bits, since it can potentially
transfer $O(n)$ different proportions of its flux to the
node u. These small sets are now encoded in blocks
of $O(\log n)$ bits into $\pi(u)$.



Lemma 2 The randomized communication complexity
is $\Omega(n \log n)$ bits, for dense graphs.



Proof. Take the previous construction, and add 2n
nodes $z_1, z'_1, \dots, z_n, z'_n$, where each $z_i$ links to u and
each $z'_i$ links to u'. Now partition the set P into $\log n$-element
sets $P_1, \dots, P_n$ where $P_j$ will be encoded by
node $x_j$. Each node $x_j$ on the cycle links to exactly
n + 1 nodes: it links to $x_{j+1}$, and for each i, it links
to either $z_i$ or $z'_i$.

Since the edges are unweighted, each edge contributes
the same flux from a node, hence each node
$x_j$ can now give n different fractions of its probability
flux to u, via the $z_i$. The intuition is that each $x_j$
can independently encode a set of $O(\log n)$ elements,
allowing us to encode $O(n \log n)$ bits of information
into $\pi(v)$. As before, we also need to show that $\pi(n)$
is still constant.

The flux now drops by a factor n + 1 on each step of
the cycle, so $\pi(j) = \frac{1}{n+1}\pi(j-1) = (n+1)^{n-j}\pi(n)$.
Hence $\sum_{j=1}^{n} \pi(j) = \pi(n)\,\frac{(n+1)^n - 1}{n}$. Let $d(P_j)$
denote the value of the binary expansion of the set $P_j$
(where $0 \le d(P_j) < 2^{|P_j|}$). Then each node $x_j$ links
to exactly $d(P_j)$ of the $z_i$ (and hence exactly $n - d(P_j)$
of the $z'_i$). Consider the stationary probability $\pi(v)$:

$$
\begin{aligned}
\pi(v) = \frac{1}{2}\pi(u) &= \frac{1}{2}\Bigl(\frac{1}{n+1}\sum_{j=1}^{n} \pi(j)\,d(P_j) + \pi(v)\Bigr)\\
\pi(v) &= \frac{1}{n+1}\sum_{j=1}^{n} \pi(j)\,d(P_j) \qquad (1)
\end{aligned}
$$

Now we show that the flux crossing the cut is indeed
constant. As before, let $s = \pi(u) + \pi(u')$ and
$t = \pi(v) + \pi(v')$. Then

$$\frac{1}{2}\,s = \pi(a) = \frac{1}{2}\bigl(\pi(u) + \pi(u')\bigr) = \frac{1}{2(n+1)}\,\pi(n)\bigl((n+1)^n - 1\bigr) + \frac{1}{2}\,t$$

and

$$t = \frac{1}{2}\bigl(\pi(u) + \pi(u')\bigr) = \frac{1}{2}\,s, \qquad\text{so}\qquad t = \frac{1}{n+1}\,\pi(n)\bigl((n+1)^n - 1\bigr).$$

Since $\pi$ is a probability distribution, it must sum
to unity. The nodes $z_i, z'_i$ together receive a fraction
$n/(n+1)$ of the flux of the cycle nodes, so their total
stationary probability is $\frac{1}{n+1}\pi(n)((n+1)^n - 1)$,
independent of P. Hence

$$
\begin{aligned}
\pi(a) + s + t + \sum_{i=1}^{n}\bigl(\pi(z_i) + \pi(z'_i)\bigr) + \sum_{j=1}^{n}\pi(j) &= 1\\
\frac{3}{2}\,s + t + \frac{1}{n+1}\bigl((n+1)^n - 1\bigr)\pi(n) + \frac{1}{n}\bigl((n+1)^n - 1\bigr)\pi(n) &= 1\\
\bigl((n+1)^n - 1\bigr)\Bigl(\frac{5}{n+1} + \frac{1}{n}\Bigr)\pi(n) &= 1
\end{aligned}
$$

and so $\pi(n)$ is a constant c independent of P. Hence
$\pi(v) = c \sum_{j=1}^{n} (n+1)^{n-j-1} d(P_j)$. Since the sets $P_j$
are of $O(\log n)$ bits, $d(P_j) < n$ and hence $\pi(v)$ reveals
all the values $d(P_j)$, and hence the n sets $P_j$. □

The previous lemmas have given us lower bounds
for the transfer across the cut between two parties,
where Alice and Bob each know only their subgraph.
Note that we have taken care that in both constructions,
the cut (X, Y) is sparse and so we can build a
modified $O(n)$-node graph by replacing each edge in
this cut by n edges.

We now appeal to the linear array conjecture and
so in the worst case at least $\Omega(n \log n)$ bits must flow
over each of these edges (using the constant in the
lifting of randomized communication bounds onto a
linear array).

3.1.1 Remarks

The expansion of $\pi(v)$ represented by Equation (1)
gives a clue as to why we cannot hope to improve
the lower bound using our current methods. See that
it can be roughly rewritten as (ignoring constants)
$\sum_{j=1}^{n} \left(\frac{1}{2}\right)^{j \lg n} d(P_j)$. Hence each $P_j$ only has $\approx \lg n$
bits in the expansion of $\pi(v)$, even though it is quite
easy to build a construction where the $P_j$'s can be of
n bits, and in this case the sets cannot be recovered
since they begin to overlap in the binary expansion
of $\pi(v)$.

The result also gives lower bounds on the worst-case
congestion and time incurred by any algorithm
to compute $\pi(u)$. For congestion, the stretching trick
means that there must be at least a linear number
of edges that each have $\Omega(n \log n)$ bits communicated
across them. For a time lower bound, there
are $\Omega(n \log n)$ bits that must cross a cut of size $O(1)$,
hence any algorithm must take time $\Omega(n \log n)$ in the
worst case.
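
The dense construction of Lemma 2, and the base-(n + 1) reading of $\pi(v)$ discussed in the remarks, can be checked in the same way as the sparse one. The sketch below is illustrative only and rests on assumptions: the helper names are hypothetical, each $x_j$ is taken to link to $z_i$ for $i \le d_j$ and to $z'_i$ otherwise, and the closed form used for $\pi(n)$ counts the stationary mass of the z nodes explicitly, which is one consistent reading of the normalization step.

```python
from fractions import Fraction

def stationary(adj):
    """Exact stationary distribution of the harmonic walk (adjacency lists)."""
    nodes = sorted(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Row u encodes pi(u) - sum_v pi(v)/outdeg(v) = 0; last column is the RHS.
    M = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for v in nodes:
        M[idx[v]][idx[v]] += 1
        for u in adj[v]:
            M[idx[u]][idx[v]] -= Fraction(1, len(adj[v]))
    M[n - 1] = [Fraction(1)] * (n + 1)  # replace a redundant row by sum(pi) = 1
    for col in range(n):  # Gauss-Jordan elimination over the rationals
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return {v: M[idx[v]][n] for v in nodes}

def lemma2_graph(n, d):
    """d[j-1] in range(n) is the value d(P_j) that node x_j encodes."""
    adj = {"xa": ["x1"], "u": ["xa", "v"], "u'": ["xa", "v'"],
           "v": ["u"], "v'": ["u'"]}
    for i in range(1, n + 1):
        adj[f"z{i}"] = ["u"]
        adj[f"z'{i}"] = ["u'"]
    for j in range(1, n + 1):
        # x_j links to exactly d_j of the z_i and n - d_j of the z'_i.
        links = [f"z{i}" if i <= d[j - 1] else f"z'{i}" for i in range(1, n + 1)]
        adj[f"x{j}"] = [f"x{j % n + 1}"] + links
    return adj

def pi_n_constant(n):
    """Closed form for pi(n), independent of the encoded values."""
    return 1 / (((n + 1)**n - 1) * (Fraction(5, n + 1) + Fraction(1, n)))

def decode_digits(n, pi_v):
    """Read d(P_1), ..., d(P_n) off pi(v) as base-(n+1) digits."""
    code = (n + 1) * pi_v / pi_n_constant(n)  # sum_j (n+1)^(n-j) d_j
    assert code.denominator == 1
    m = int(code)
    return [(m // (n + 1)**(n - j)) % (n + 1) for j in range(1, n + 1)]

d = [2, 0, 1]
pi = stationary(lemma2_graph(3, d))
decoded = decode_digits(3, pi["v"])
```

Because each digit $d(P_j)$ is smaller than the base n + 1, the digits never overlap; this is the same reason the remark above gives for each $P_j$ being limited to roughly $\lg n$ bits.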



3.1.2 Labelling schemes 



A simple and interesting corollary of the two-party
lower bound in this section is a lower bound on the
complexity of a labelling scheme. A distance labelling
scheme for a graph G is an assignment of labels l(v) to
nodes v of G such that, by examining only the labels
l(u), l(v), one can determine the distance d(u, v) between
u and v. By encoding global information about
a graph into local labels, labelling schemes have many
practical applications in large-scale distributed networks
[18]. A hitting time labelling scheme is the
natural extension of a distance labelling scheme to
the random walk on a graph: given l(u), l(v), one
can compute the expected length E[d(u, v)] of the
random walk beginning at u and terminating on first
hitting v.

Since $\pi(u) = 1/E[d(u, u)]$, Lemma 2 implies that
for any hitting time labelling scheme, there must be
some graph where some node must be assigned a label
of size $\Omega(n \log n)$ bits (in particular, this must occur
for some node computing the expected time for the
token to return to itself). Clearly there is an upper
bound of $O(n^2)$ on the size of labels (by encoding the
whole graph into each label), but the aim of efficient
labelling schemes is to do much better than this. In
[18] it is shown that $\Theta(n)$ bits is the optimal distance
label length for general unweighted graphs, so our
result shows an increase in complexity, but we do not
know if the increase is more than just a logarithmic
factor.
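
The identity $\pi(u) = 1/E[d(u, u)]$ used above can be checked numerically on a small example. The following Python sketch is only an illustration under stated assumptions: the helper names (`solve`, `hitting_times`, `return_time`) and the three-node graph are hypothetical, and expected hitting times are obtained by solving the standard first-step equations $h(w) = 1 + \sum_x p_{wx} h(x)$ exactly.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def hitting_times(adj, target):
    """E[d(w, target)] for every node w other than target."""
    others = [w for w in sorted(adj) if w != target]
    idx = {w: i for i, w in enumerate(others)}
    m = len(others)
    A = [[Fraction(0)] * m for _ in range(m)]
    b = [Fraction(1)] * m
    for w in others:
        A[idx[w]][idx[w]] += 1
        for x in adj[w]:
            if x != target:  # reaching the target contributes h = 0
                A[idx[w]][idx[x]] -= Fraction(1, len(adj[w]))
    h = solve(A, b)
    return {w: h[idx[w]] for w in others}

def return_time(adj, u):
    """Expected return time E[d(u, u)]: one step, then hit u from there."""
    h = hitting_times(adj, u)
    h[u] = Fraction(0)
    return 1 + sum(Fraction(1, len(adj[u])) * h[x] for x in adj[u])

# Directed example: a -> b, a -> c, b -> c, c -> a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
returns = {u: return_time(graph, u) for u in graph}
```

For this graph the return times are 5/2, 5 and 5/2, the reciprocals of the stationary probabilities 2/5, 1/5 and 2/5, so a hitting time label for a node must in effect determine its stationary probability.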

3.1.3 Upper bounds for computing $\pi$

A simple algorithm to compute $\pi(u)$ would be to construct
a spanning tree T of G rooted at node u, and
for each node to send a description of its edges to u
using T. For general unweighted graphs, this would
require $O(n^2)$ bits being sent over $O(n)$ edges in total.
Constructing a distributed algorithm with $o(n^3)$
bits worst-case total communication appears to be
a challenging problem. We conjecture that the true
bound is $O(n^2\,\mathrm{polylog}(n))$ but have been unable
to prove this by exhibiting an algorithm. We also believe that,
for the problem of exactly computing the stationary
probabilities, randomization is of no help as regards
worst-case communication complexity.

For the related shortest paths problem, there is a
long and interesting history of efficient algorithms,
both sequential and more recently, distributed. Understanding
these may help in obtaining nontrivial upper
bounds for the path problems we consider here.
The best known communication complexity upper
bound in the distributed case is $O(n^2 \log^2 n)$ bits
and relies on a graph decomposition to represent the
graph as a partition of sparsely-connected clusters
[TO] , It is reasonably easy to show that any dis- 
tributed algorithm that computes the shortest path 
lengths must have worst-case communication complexity
$\Omega(n^2 \log n)$ bits (there exist graphs where the
length of the path to each node requires $\Omega(\log n)$ bits,
and each must be sent over $\Omega(n)$ edges). Hence its
communication complexity is fairly well-understood.

4 Computing the entire distribution

In this section we consider the slightly different problem
of some node v knowing the entire vector $\pi$ of
stationary probabilities. This may correspond to a
distributed crawling algorithm that terminates with 
some centralized server v knowing the entire vector. 

Our results for this section illustrate an interesting weakness of our lower bound technique. In the two-party case, we prove that the trivial algorithm of sending the entire graph is optimal (and so we cannot do any better here), but since the cut between the $x_i$ and $y_j$ nodes is dense, we cannot amplify our bound by lifting onto a linear array (or other sparse structure). Therefore we only obtain an $\Omega(n^2)$ bound in the distributed case, even though the trivial (spanning tree) algorithm costs $O(n^3)$ in this model. Improving this situation with our current technique would involve finding a construction with the same two-party complexity but with a much sparser cut, say with a constant or (poly)logarithmic number of edges. We feel that an $\Omega(n^2)$ bound is not possible with this number of edges.

The intuition for an information-theoretic lower bound might be something like this: each edge in the graph can alter the vector $\pi$, therefore there are $2^{n^2}$ possible vectors, so $\Omega(n^2)$ bits must be communicated. Of course, proving this is nontrivial because while each edge does indeed change $\pi$, it is still possible that many combinations of edges result in the same $\pi$. For example, an $n$-clique and an $n$-cycle (adding chords to make it ergodic) both have the same $\pi$. So what we need to prove is a bound on the number of distinct vectors $\pi$ (or, the number of distinct principal eigenvectors of a set of $n$-node graphs).
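The clique/cycle example can be checked directly. The sketch below is illustrative only: it verifies that the uniform vector satisfies the balance equations $\pi P = \pi$ of the harmonic random walk on both graphs. The particular chord pattern (each node also points two steps ahead) is an assumption, since the text does not say which chords are added, and `is_stationary` is a hypothetical helper name.

```python
from fractions import Fraction

def is_stationary(adj, pi):
    """Check pi P = pi exactly for the harmonic walk on adjacency lists `adj`."""
    inflow = {v: Fraction(0) for v in adj}
    for u, outs in adj.items():
        share = pi[u] / len(outs)  # u splits its mass evenly over its out-edges
        for w in outs:
            inflow[w] += share
    return all(inflow[v] == pi[v] for v in adj)

n = 6
uniform = {v: Fraction(1, n) for v in range(n)}
clique = {v: [w for w in range(n) if w != v] for v in range(n)}
# Assumed chord pattern: each node also points two steps ahead.
chorded_cycle = {v: [(v + 1) % n, (v + 2) % n] for v in range(n)}
# A small graph whose stationary vector is not uniform, for contrast.
lopsided = {0: [1], 1: [0, 2], 2: [0]}

print(is_stationary(clique, uniform), is_stationary(chorded_cycle, uniform))
```

Both graphs are very different as edge sets yet share one stationary vector, which is why counting graphs alone does not give the bound.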

We prove the lower bound by exhibiting a family of $n$-node graphs with $2^{\Omega(n^2)}$ distinct principal eigenvectors.

Theorem 3 There is a family of $n$-node directed, unweighted Markov chains with $2^{\Omega(n^2)}$ distinct stationary vectors $\pi$.

Proof. The construction is as follows. There are nodes $x_1, x_2, \ldots, x_n$, $y_1, y_2, \ldots, y_n$, $y'_1, y'_2, \ldots, y'_n$ and a sink node $s$. The edges are as follows:

• a cycle $x_1 \to x_2 \to x_3 \to \cdots \to x_n \to x_1$

• $s \to x_1$, to get an exponential dropoff on the cycle

• $y_i \to s$ and $y'_i \to s$, for all $i$

Finally, call a matrix $V = (v_{ij})$ legal if $v_{1j} = 1$ and $v_{2j} = 0$ for all $j$. For each entry $v_{ij}$, if $v_{ij} = 1$, add the edge $x_i \to y_j$ and if $v_{ij} = 0$, add $x_i \to y'_j$. First, note that this graph is strongly connected for all legal matrices (this is why we forced the first row elements to 1, second row to 0). The construction is illustrated in Figure 2.

We now show that each legal matrix gives a different vector $\pi$ over the nodes $y_j$. Firstly, see that if we flip one bit $v_{ij}$, the only values that change are $\pi(y_j)$ and $\pi(y'_j)$. Now we show that each distinct vector $(v_{3j}, v_{4j}, \ldots, v_{nj})$ gives a different value for $\pi(y_j)$.

To the contrary, assume there are two vectors $v \neq v'$ with $\pi_v(y_j) = \pi_{v'}(y_j)$, where $\pi_v(y_j)$ is the stationary probability of $y_j$ under the vector $v$ (the $j$th column of the matrix). Each $x_i$ has out-degree $n + 1$ (one cycle edge plus one edge per column), so by definition

$$\pi_v(y_j) = \frac{1}{n+1}\Bigl(\pi(x_1) + \sum_{i=3}^{n} v_i\,\pi(x_i)\Bigr)$$
$$= c + \frac{1}{n+1}\sum_{i=3}^{n} v_i\,\pi(x_i) \quad \text{for some constant } c$$
$$= c + \frac{1}{n+1}\sum_{i=3}^{n} v'_i\,\pi(x_i) \quad \text{by assumption,}$$

that is, if $\pi_v(y_j) = \pi_{v'}(y_j)$ then there must be two distinct subsets of $(\pi(x_1), \pi(x_2), \ldots, \pi(x_n))$ that sum to the same value. But this is impossible since the values fall off exponentially (with the same factor) on the cycle construction.

This proves that each legal matrix gives a different vector $\pi$, therefore the number of different vectors $\pi$ is equal to the number of different legal matrices $(v_{ij})$, which is $2^{n(n-2)}$. □
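For small $n$ the counting claim can be checked by brute force. The following is a sketch under stated assumptions rather than part of the proof: it builds the construction for every legal matrix, solves for the stationary distribution of the harmonic walk exactly with rational arithmetic, and counts the distinct vectors over the $y_j$ nodes. The helper names (`stationary`, `build`, `count_distinct`) and the tuple node labels are hypothetical.

```python
from fractions import Fraction
from itertools import product

def stationary(adj):
    """Exact stationary distribution of the harmonic walk (Gauss-Jordan)."""
    nodes = list(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Rows encode (P^T - I) pi = 0; the last row is replaced by sum(pi) = 1.
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for u, outs in adj.items():
        for w in outs:
            A[idx[w]][idx[u]] += Fraction(1, len(outs))
    for i in range(n):
        A[i][i] -= 1
    A[n - 1] = [Fraction(1)] * (n + 1)
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return {v: A[idx[v]][n] for v in nodes}

def build(n, V):
    """Graph of the construction; V[i][j] in {0, 1}, rows and columns 0-based."""
    adj = {("s",): [("x", 0)]}
    for i in range(n):
        outs = [("x", (i + 1) % n)]  # cycle edge
        outs += [("y", j) if V[i][j] else ("yp", j) for j in range(n)]
        adj[("x", i)] = outs
        adj[("y", i)] = [("s",)]
        adj[("yp", i)] = [("s",)]
    return adj

def count_distinct(n):
    """Number of distinct vectors (pi(y_1), ..., pi(y_n)) over all legal matrices."""
    seen = set()
    free_rows = n - 2
    for bits in product((0, 1), repeat=free_rows * n):
        V = [[1] * n, [0] * n]  # first row forced to 1, second row to 0
        V += [list(bits[r * n:(r + 1) * n]) for r in range(free_rows)]
        pi = stationary(build(n, V))
        seen.add(tuple(pi[("y", j)] for j in range(n)))
    return len(seen)

print(count_distinct(3))
```

The count should equal $2^{n(n-2)}$, the number of legal matrices. Exact fractions are used because the probabilities differ by exponentially small amounts, which floating point could blur.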

The above theorem shows constructively that there is a family of $n$-node graphs (the whole construction) with $2^{\Omega(n^2)}$ distinct stationary vectors (taken over the $O(n)$-node subgraph of the $y_j$'s) and so any node that is to know this vector must have at least $\Omega(n^2)$ bits communicated to it in the worst case.

5 Approximate computation 

In this section we turn to the natural problem of approximating the stationary probabilities $\pi(v)$. Firstly, see that we must be careful with our notion of approximation: if $\pi(v)$ is to be computed to within $k$ bits of precision, we can use the construction of Lemma 2 to encode a set of size $O(k)$ bits rather than $O(n)$ bits, and the lower bound is accordingly reduced to $\Omega(k)$ bits across $\Omega(n)$ edges. A more natural notion of approximation may be to compute $\pi(v)$ to within a given factor. Call $\tilde\pi(v)$ a $k$-approximation to $\pi(v)$ if $\frac{1}{k}\pi(v) \le \tilde\pi(v) \le k\,\pi(v)$. In this section we prove that any distributed algorithm that computes a $k$-approximation to $\pi(v)$ for some chosen $v$ (even with high probability) must send at least $\Omega(\log\frac{n}{\log k})$ bits across $\Omega(n)$ edges in the worst case.

First, let us examine the case k = 2. See that the 
difficulty with using our original binary encoding of 
the set is that, in the binary representation, all the 
bits of lower order than the highest order bit can be 
changed arbitrarily while remaining within a factor 
2. The basic idea is to use a simple error correcting 
code that resists changes to a numeric factor of 2. 
Just using the highest order bit is not sufficient, since 
for example 0100 can become 1000 (an increase by a 
factor 2) and 0010 (a decrease by a factor 2). A 
simple solution is to pad out the expansion, using 
blocks of length 3 bits, for example 

$$\cdots 00\,\underbrace{010}_{\text{a block}}\,00 \cdots$$

Then the highest order bit can never fall out of its block. In the lower bound, we use blocks of length $1 + 2\lceil \lg k \rceil$ bits to withstand factor $k$ approximations.

Figure 2: Construction for Theorem 3.



The idea, then, is to use a variant of the construction from Lemma 2 to produce a binary string where the set $S$ is encoded into the position of the highest order bit of the 'blocked' binary expansion. Just considering encoding a single set $S$, we are looking for a binary expansion of the form

$$0 \ldots 000\,\underbrace{1}_{d(S)}\,000 \ldots$$

where the 1 is at some position, determined by the value (as previously defined) $d(S)$ of the set. Encoding $O(\log n)$ elements requires $O(2^{\log n}) = O(n)$ possible indices in a binary expansion, which is exactly what we have available from the original construction.
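The blocked encoding can be illustrated numerically. This is a simplified sketch, not the paper's construction: it encodes an index $j$ as a single power of two placed at the centre of block $j$, perturbs it by factors between $1/k$ and $k$, and recovers $j$ from the position of the highest-order bit. The function names and the 1-based block numbering are assumptions made for the example.

```python
from fractions import Fraction

def top_bit_position(x):
    """Smallest e >= 1 with 2**-e <= x, for a Fraction 0 < x < 1."""
    a, b = x.numerator, x.denominator
    e = b.bit_length() - a.bit_length()
    if (a << e) < b:
        e += 1
    return e

def pad_bits(k):
    """ceil(lg k) for an integer k >= 1, computed without floating point."""
    return (k - 1).bit_length()

def encode(j, k):
    """Place a single 1 at the centre of block j (blocks are 1-based)."""
    pad = pad_bits(k)
    kappa = 1 + 2 * pad  # block length 1 + 2*ceil(lg k)
    return Fraction(1, 2 ** ((j - 1) * kappa + pad + 1))

def decode(x, k):
    """Recover the block index from the position of the highest-order bit."""
    kappa = 1 + 2 * pad_bits(k)
    return (top_bit_position(x) + kappa - 1) // kappa

k, j = 3, 5
exact = encode(j, k)
for factor in (Fraction(1, k), Fraction(2, 3), Fraction(1), Fraction(5, 2), Fraction(k)):
    assert decode(exact * factor, k) == j
print("block", j, "recovered under every tested factor")
```

The padding of $\lceil \lg k \rceil$ positions on each side of the centre is what keeps the leading bit inside its block: a factor of at most $k$ in either direction moves it by at most that many positions.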

To encode an $O(\log n)$ bit set $S$, we compute $d(S) = \sum_{j \in S} 2^j$ and then use the node $x_{d(S)}$ to encode this value by linking it to $u$, and all other nodes to $u'$. The claim is that the block containing the highest order bit (and hence the encoding of the set) of the binary expansion of $\pi(v)$ can be recovered from the binary expansion of $\pi^k_v$. Let us now prove the main lower bound.

Theorem 4 Consider any distributed algorithm that computes $\tilde\pi(v)$ with $\Pr\bigl(\frac{1}{k}\pi(v) \le \tilde\pi(v) \le k\,\pi(v)\bigr) > 2/3$. Then it must send $\Omega(\log\frac{n}{\log k})$ bits over $\Omega(n)$ edges in the worst case.

Proof. We extend the idea outlined above to an approximability-preserving reduction from set-disjointness. The main difference is the 'block cycle' construction: for each node $x_1, \ldots, x_n$ on the cycle, replace it with a block of $\kappa = 1 + 2\lceil \lg k \rceil$ nodes where $x_i$ is now the center node of the $i$th block (and the rest are dummy nodes). We will show how to encode an $O(\log n)$ bit set $S$ using $O(n\kappa)$ nodes in the construction. Imagine that the set has value $d(S) = j$. Now, pick the node $x_j$ on the cycle and add an edge $x_j \to u$, and for all other nodes on the cycle (including the dummy nodes) add an edge to $u'$. The block construction will let $v$ recover the value $j$ under a $k$-approximation. The ratio between successive $x_i$'s on the cycle is now $2^\kappa$, hence

$$p_j = 2^{\kappa(n-j)}\,p_n .$$

A bit of algebra establishes that $p_n$ is indeed a constant, $p_n = 1/(3(2^{n\kappa} - 1))$. Consider now the actual value $\pi(v)$. Imagine encoding the set $S$. Let $j$ be such that the node $x_j$ will represent this set (as described above). Then

$$p_v = p_u/2 = p_j/2 = 2^{\kappa(n-j)-1}\,p_n = \frac{2^{\kappa(n-j)-1}}{3(2^{n\kappa} - 1)} \approx \frac{1}{3}\cdot\frac{1}{2^{\kappa j+1}} .$$

Now $v$ can find $j$ easily, since $2^{\kappa j} = (2^{1+2\lceil \lg k \rceil})^j = (2k^2)^j$, and $\log_{2k^2} (2k^2)^j = j$. Now we claim that the block containing the highest order bit is the same in both $\pi^k_v$ and $\pi(v)$, by showing that the values must be separated by a factor of at least $2k^2$. Assume the set being encoded has value $j$, hence (ignoring constants) the largest $k$-approximate value of $p_v$ is

$$k\,p_v \approx k\cdot\frac{1}{2^j k^{2j}} = \frac{1}{2^j k^{2j-1}} .$$

Now if the set had value $j - 1$ instead, then the smallest $k$-approximate value would be (letting $p'_v$ be $v$'s stationary probability using the set with value $j - 1$)

$$\frac{1}{k}\,p'_v \approx \frac{1}{2^{j-1} k^{2(j-1)+1}} = 2\cdot\frac{1}{2^j k^{2j-1}} = 2k\,p_v .$$

These two values are the closest possible (a $k$-overapproximation with set $j$ and a $k$-underapproximation with set $j - 1$) and are still separated by exactly one bit in their binary expansion, and so there is no overlap and the value of the set can be recovered. Intuitively, the binary expansions of $\pi^k_v$ and $\pi(v)$ look like the following, for a set $S$ with value $d(S) = j$.

    $\pi^k_v$ :  ··· 000  0110101011  010010 ···
                          block j     j − 1

    $\pi(v)$ :   ··· 000  00100  00000000000 ···
                          block j

where each block has $\kappa$ bits and the highest order bit of $\pi^k_v$ is contained in block $j$ iff the highest order bit of $\pi(v)$ is in block $j$.

For the communication complexity bound, see that there are only a constant number of edges crossing the cut, and so any protocol that computes a $k$-approximation $\pi^k_v$ with probability $p$ allows us to solve $O(\log n)$ disjointness with $O(n\kappa) = O(n \log k)$ nodes, with the same probability. Hence, for a graph of $n$ nodes we can solve $\Omega(\log(n/\log k)) = \Omega(\log n - \log\log k)$ disjointness. The result follows since the cut is sparse and we can appeal to the linear array result, and by the communication complexity of set-disjointness. □

5.0.4 Remarks 

As before, the result yields analogous time and congestion lower bounds. Also, note that $\Omega(\log n - \log\log k) = \Omega(\log n)$ for constant $k$. It may be interesting to investigate what happens for $k = 1 + \epsilon$.

6 Computing the ranks 

We say that a node $u$ has rank$(u) = k$ iff there are exactly $k$ nodes $\{v_1, \ldots, v_k\}$ with stationary probabilities at least as large as $u$: $\pi(u) \le \pi(v_i)$ for $i = 1 \ldots k$. Hence $u$ has maximal rank iff rank$(u) = 1$, and it has minimal rank iff rank$(u) = n$.
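As a small illustration of this definition, the sketch below computes ranks from a given stationary vector. It assumes that the $k$ nodes counted include $u$ itself (so that the maximal node has rank 1); the function name `ranks` and the example vector are hypothetical.

```python
from fractions import Fraction

def ranks(pi):
    """rank(u) = number of nodes (u included) whose probability is >= pi[u]."""
    return {u: sum(1 for w in pi if pi[w] >= pi[u]) for u in pi}

pi = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
r = ranks(pi)
print(r)                          # {'a': 1, 'b': 2, 'c': 3}
print({u: r[u] % 2 for u in r})   # rank parities: {'a': 1, 'b': 0, 'c': 1}
```

Note that computing a rank this way needs the whole vector, whereas the rank itself fits in $O(\log n)$ bits; the rest of this section is about that gap.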

In this section we consider the difficulty of computing the rank of some node. Clearly if there are $n$ nodes then the rank of a node can be expressed with $O(\log n)$ bits, unlike the stationary probability, which by Lemma 5 can require $\Omega(n \log n)$ bits. On the other hand, knowing that rank$(u) = k$ implies that some node must know there are exactly $k - 1$ nodes having larger stationary probability and $n - k$ nodes having smaller stationary probability.

We now investigate the case where some algorithm terminates with many nodes in the network knowing their ranks; again we prove a lower bound via a two-party argument and a lifting of the lower bound onto a linear array. In the two-party case, both Alice and Bob will need to know the ranks of $\Omega(n)$ nodes in each of their subgraphs.

In fact, the lower bound holds when only the parity of the ranks is known, i.e. whether a node's rank is odd or even. The lower bound shows that at least $\Omega(n^2)$ bits must be communicated in total. Although our original intention was to prove a result for knowing the exact rank, we have been unable to strengthen it beyond the result here.

Given this statement, a natural question is 'what 
use is knowing whether your rank is odd or even, 
since it doesn't imply anything about knowing how 
many nodes have larger or smaller stationary value 
than you (except for the parity of this number)?' We 
feel that presenting the result in this form exposes 
more details about the problem, and our proof. 

On the other hand, if each node knows whether it 
has even or odd rank, this may provide a useful par- 
titioning or symmetry breaking of the network into 
two pieces where the total stationary probability in 
each piece is approximately equal (since if one side 
has the maximum node then the next largest will be 
in the other side). An interesting thing would be to 
determine if this can be done without explicitly com- 
puting the ranks first. 

The following theorem shows that the communi- 
cation complexity of computing the rank parities is 
surprisingly large. 

Theorem 5 Consider any algorithm that terminates with each node $v_i$ knowing rank$(v_i) \bmod 2$. The communication complexity is $\Omega(n)$, and this many bits must be sent over $\Omega(n)$ edges in the worst case.

Proof. To construct the two-party problem, partition the network into $(X, Y)$ where Alice knows $X$ and Bob knows $Y$, and form an 'exponential cycle' in $X$ with nodes $x_0 \to x_1 \to \cdots \to x_{2n+1} \to x_0$ and add an edge from the sink $s \to x_0$. For each $i$, add edges $x_0 \to a'_i$, $x_{2n+1} \to a_i$ and $a_i \to s$, $a'_i \to s$. Now, for each $i$, if $i \in P$, add edges $x_{2i} \to a_i$, $x_{2i+1} \to a'_i$, else add edges $x_{2i} \to a'_i$, $x_{2i+1} \to a_i$. The point of the construction so far is that all the $a'$ nodes have higher stationary probability than all the $a$ nodes. The $Y$ partition is exactly symmetric, with $y_{2i} \to b_i$, $y_{2i+1} \to b'_i$ if $i \in Q$, else $y_{2i} \to b'_i$, $y_{2i+1} \to b_i$. Finally, connect the two partitions with a sparse cut by adding edges $s \leftrightarrow t$ between the two sinks.

So far, the construction is completely symmetric. But we want the $b_i$ nodes in $Y$ to have slightly lower stationary probabilities, so we add a self-loop at node $s$. The construction is illustrated in Figure 3. Now, we consider the rankings of each node in the construction. There are three claims:

1. The rankings of the $x_i, y_i, s, z, t$ are independent of $P, Q$. This follows since the stationary probabilities of these nodes are constant (in the same way as for the construction of Lemma 1), and for each $i$, rank$(x_{2i})$ and rank$(x_{2i+1})$ are always the same distance apart, i.e. they are always separated by the $a_i, a'_i$ nodes, and the $y_i$ nodes (which also have constant stationary probability). Finally, adding the self-loop at $s$ only changes the relative stationary probabilities of the two sides, since both $\pi(s), \pi(t)$ are constant.

2. rank$(a'_i) <$ rank$(a_j)$ and rank$(b'_i) <$ rank$(b_j)$, for all $i, j$. This is because the $a'_i$ and $b'_i$ tap into the $x_0$ and $y_0$, which have the highest stationary probability on the two cycles.

3. For any sets $P, Q$, rank$(a_i)$ is less than both rank$(a_{i+1})$ and rank$(b_{i+1})$, i.e. $\pi(a_i)$ is larger. Also, rank$(b_i)$ is less than both rank$(a_{i+1})$ and rank$(b_{i+1})$. This is because the probability flux drops off exponentially along the cycle.

So, the only effect of the sets $P, Q$ on the rankings of the $a_i$ and $b_i$ is to change the relative rankings of $a_i$ and $b_i$, for each $i$.

Lemma 3 If $i \in P$ and $i \in Q$ then rank$(a_i) <$ rank$(b_i)$. On the other hand, if $i \notin P$, $i \in Q$ then rank$(a_i) >$ rank$(b_i)$.

Figure 3: Construction for lower bound on computing ranks (Theorem 5).



Proof. The first part follows from the fact that the $a_i$ nodes in Alice's half (encoding $P$) have higher stationary values than the $b_i$ nodes in Bob's half (due to the self-loop at $s$). For the second part, we need that the ratio $\pi(x_i)/\pi(y_i) < 2$. This is true because $p_s = \tfrac{3}{2}\,p_t$ by the self-loop at $s$, and so $\pi(x_i)/\pi(y_i) = (\pi(s)/3^i)(3^i/\pi(t)) = 3/2$. □

Hence, for any $i$ such that $i \in Q$, node $b_i$ can inspect whether its ranking is odd or even, and accordingly determine whether $i \in P$. From this, Alice can determine the set $Q$ held by Bob, and then can determine the inner product of $P$ and $Q$. Since we have managed to ensure that the cut is sparse, we can lift our result onto a linear array of size $n$ and the lower bound follows. □

Interestingly, in the construction, it is easy for each $a_i$ node to compute its stationary probability, since it only depends on the structure of the $X$ partition (and hence can be computed with no communication between the two partitions), but to compute the rank of $a_i$ is difficult because it depends on the structure of the other partition and there is an interplay between the two sides of the network. So, although the nodes do not have to compute the stationary probabilities, they must still implicitly know something about the structure of the ordering of the stationary probabilities in the network.

6.0.5 Remarks 

It may be possible to improve the bound by encoding information into the $n!$ orderings of the ranks of the $a_i$ nodes (which would intuitively allow us to solve the greater-than problem on $\Omega(n \log n)$ bit sets), but it appears a challenging problem to achieve this with only a sparse cut. Without a sparse cut, we would be unable to appeal to the linear array conjecture to lift the bound onto a network.

6.1 Computing the maximum node 

In this section we consider the problem of computing the node of maximal rank, i.e. for some $v$, an algorithm that terminates with at least one node $u$ knowing if $v$ is of maximum rank.

The node with maximal rank in a Markov chain is analogous to the center of a network in the shortest paths framework.² There appear to be several interesting applications for algorithms for finding the maximal rank node in a chain. For example, in a distributed network one could select a node with maximal rank to store a file, or to act as a leader of a subset of nodes.³

6.1.1 Lower bound 

We will prove that any deterministic distributed algorithm must send $\Omega(n^2 \log n)$ bits in total, and must cause congestion of $\Omega(n \log n)$ on at least $\Omega(n)$ links. In the deterministic case, this is as strong as the lower bound of Theorem 2 for exactly computing the stationary probability at $v$ (although we have been unable to show that deciding if $v$ is of maximal rank is at least as hard as computing $\pi(v)$).

The output size of the problem is a single bit, so a simple information-theoretic bound would be far from strong. However, we can do much better by showing how the network can do some useful computation for us. We first prove an easy lower bound on the complexity of the following problem. There are two nodes $u, v \in G$, and some node (in particular, this could be one of $u, v$) wants to decide whether $\pi(u) > \pi(v)$.

Lemma 4 Any algorithm that terminates with some node knowing whether $\pi(u) > \pi(v)$ for two nodes $u, v$ can solve greater-than on numbers of size $O(n \log n)$ bits, in the worst case.

Proof. We shall use a modified version of the con- 
struction used in Lemma [H except that Bob's side 
will encode a set Q, and Alice will be able to deter- 
mine which set is greater by looking at the value of 
$\pi(v)$. Let Bob build his side in a symmetric manner to Alice, using his set $Q$. Now, on both sides, only the even nodes on the cycle are used to encode the elements (as opposed to using all the nodes as in the construction of Lemma 3).

² The center of a graph $G$ is the set of nodes of minimal graph eccentricity, and the eccentricity of a node $v$ is $\max_u d(u, v)$.

³ For reversible chains the stationary probability is proportional to the degree of a node so computing the node of maximal rank is equivalent to traditional leader election: elect the leader as the node with highest degree, and of course the degree is known to each node.

These sets are encoded in reverse, so the least significant bit of the stationary probability represents the largest element of the set. More precisely, we can show (as in the proof of the next theorem) that in the simplified case (ignoring constants) the difference of the stationary values elegantly encodes the difference of the sets:

$$\pi(u) - \pi(v) = \sum_{j \in P} 2^{-2j} - \sum_{k \in Q} 2^{-2k} .$$

Since the elements are encoded in reverse, we define the 'reversed' set $\bar P = \{\, j \in P \mid (|P| - j + 1) \in P \,\}$. Now suppose some node knows which of $u, v$ has highest rank:

$$\pi(u) - \pi(v) > 0 \iff \sum_{j \in P} 2^{-2j} - \sum_{k \in Q} 2^{-2k} > 0$$
$$\iff \sum_{j \in P} 2^{-2j} > \sum_{k \in Q} 2^{-2k}$$
$$\iff \bar P > \bar Q .$$

The difference $\pi(u) - \pi(v)$ then reveals the difference $\bar P - \bar Q$, where $\bar P, \bar Q$ are taken wrt their binary expansions. Since there is a bijection between $P$ and $\bar P$ there is no loss in using this representation.

Therefore if Alice knows whether $\pi(u) > \pi(v)$, she can solve greater-than on numbers of $O(n \log n)$ bits (and with the same probability and error if they use a randomized protocol). □
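The order-preserving nature of the reverse encoding can be checked exhaustively on small sets. This is a sketch of the simplified case only (constants ignored, as in the proof): each element $j$ contributes weight $2^{-2j}$, and comparing these sums is compared against comparing integers in which element 1 is the most significant bit. The integer reading `reversed_value` is a hypothetical stand-in for the paper's reversed set, introduced only for this check.

```python
from fractions import Fraction
from itertools import combinations

def weight(S):
    """Simplified stationary contribution: element j adds 2**(-2j)."""
    return sum((Fraction(1, 4 ** j) for j in S), Fraction(0))

def reversed_value(S, n):
    """Hypothetical integer reading of S in which element 1 is the top bit."""
    return sum(2 ** (n - j) for j in S)

n = 5
subsets = [set(c) for r in range(n + 1)
           for c in combinations(range(1, n + 1), r)]
agree = all(
    (weight(P) > weight(Q)) == (reversed_value(P, n) > reversed_value(Q, n))
    for P in subsets for Q in subsets
)
print(agree)
```

The two orders coincide because in both encodings each element outweighs all larger elements combined, so both comparisons are decided by the smallest element on which the sets differ.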

Since the randomized communication complexity of greater-than is $O(\log n)$ yet any deterministic protocol must communicate at least $\Omega(n)$ bits, the lemma suggests that randomization may be of some help in solving this problem. As before, we will appeal to the linear array result to lift our two-party lower bounds onto linear arrays to obtain bounds for the distributed case.

If there are no ties, then the lemma immediately implies the same communication bound for computing whether rank$(u) >$ rank$(v)$. Now we can use the lemma to show the same lower bound for some node determining which side of a (specified) partition the node of maximal rank lies in. The idea is simple but the algebra tedious: if we can modify the construction to force the top two nodes $u, v$ to lie in opposite sides of the partition, then knowing which side contains the maximum implies knowing whether $\pi(u) > \pi(v)$ or not, and the same result as in the lemma applies.

Theorem 6 Consider a partition $(X, Y)$ of a graph. Any deterministic (randomized) distributed algorithm that terminates with at least one node knowing whether the node of maximal rank is in $X$ must communicate at least $\Omega(n \log n)$ bits ($\Omega(\log n + \log\log n)$ bits) over $\Omega(n)$ edges in the worst case. In particular, this applies if at least one node $v$ knows if it has maximal rank.



Proof. The construction is based on the construction of the previous lemma, except that we want $v$ to be able to determine the answer to an instance of greater-than by knowing if $v$ has maximal rank. An interesting feature of the construction is that it does more than just encoding a set into the result; the network itself actually does some computation in solving the greater-than instance, and the result appears at $v$. To achieve this, we need to build the construction twice, once in each partition of the network. This is quite a powerful idea and allows us to substantially improve on the purely information-theoretic lower bounds.

Let $P, Q \subseteq \{1, \ldots, n\}$, and all nodes in $X$ (Alice's subgraph) know only $P$ and all nodes in $Y$ (Bob's subgraph) know only $Q$. The idea is that if the node with maximal rank is in the $X$ partition then $P > Q$ (wrt their binary expansions), otherwise $P < Q$. We shall use the construction of the previous lemma, but with a modification since the two sink nodes $x_a, y_a$ take the top two spots, and since their values are independent of the sets $P, Q$ (by the construction) we cannot use them to distinguish between $P, Q$. We shall add edges $x_a \to u$ in Alice's subgraph and $y_a \to v$ in Bob's subgraph, and self-loops at nodes $u, u', v, v'$. The idea is that this will force one of $u, v$ to be of maximal rank without affecting the fundamental properties of the construction (since the flux transferred from $x_a$ to $u$ is constant). Therefore $v$ can check whether it is the maximum (in which case $P < Q$) and if not, then $u$ must be the maximum (in which case $P > Q$). The full construction is shown in Figure 4. Note the similarity to the construction of Lemma 2.

We will only work through the details of the construction for the case where each node $x_i$ on the cycle links directly to $u$ or $u'$ (and similarly for the $y_j$ in Bob's half of the network); this will give the $O(n)$-bit encoding for sparse graphs, and it can be improved as before to $O(n \log n)$ for dense graphs by the technique of having each $x_i$ link to $O(n)$ intermediate nodes (in which case the algebra becomes quite messy).

For ease of notation, we shall use $p_j = \pi(x_j)$ and $q_j = \pi(y_j)$. Letting $s = \pi(u) + \pi(u')$ and $t = \pi(v) + \pi(v')$, we have:

$$\pi(x_a) = \tfrac{1}{2}\bigl(\pi(u) + \pi(u')\bigr) \;\Rightarrow\; s = 2\pi(x_a) \;\Rightarrow\; \pi(x_a) = \tfrac{1}{2}\Bigl((2^m - 1)\,p_m + \tfrac{t}{2}\Bigr)$$

and similarly $\pi(y_a) = \tfrac{1}{2}\bigl((2^m - 1)\,q_m + \tfrac{s}{2}\bigr)$. Combining these equations, we get

$$s + t = (2^m - 1)(p_m + q_m) + \tfrac{1}{2}(s + t) = 2(2^m - 1)(p_m + q_m) .$$

Since the sum of all stationary probabilities is 1, this gives

$$s + t + \pi(x_a) + \pi(y_a) + (2^m - 1)(p_m + q_m) = 1$$
$$\tfrac{3}{2}(s + t) + (2^m - 1)(p_m + q_m) = 1$$
$$4(2^m - 1)(p_m + q_m) = 1$$
$$p_m + q_m = c$$

where $c$ is a constant that depends only on $m = 2n$. We can also show that both $p_m, q_m$ are constant, since

$$\pi(u) + \pi(u') = (2^m - 1)\,p_m + \tfrac{1}{2}\Bigl((2^m - 1)\,q_m + \tfrac{s}{2}\Bigr)$$

which solves to give

$$s = \tfrac{2}{3}(2^m - 1)(2p_m + q_m) . \qquad (2)$$

Now we also have $p_1 = p_m/2 + \pi(x_a)/2$, so $\pi(x_a) = (2^m - 1)\,p_m$ and hence

$$s = 2\pi(x_a) = 2(2^m - 1)\,p_m . \qquad (3)$$

Combining (2) and (3) gives $2(2^m - 1)\,p_m = \tfrac{2}{3}(2^m - 1)(2p_m + q_m)$ and hence $p_m = q_m = c/2$. Therefore both are constants even though they are in different partitions; this means that the flux flowing around both cycles is independent of the sets $P, Q$. Now we can use this to find the value $\pi(u)$:

$$\pi(u) = \tfrac{1}{2}\sum_{i \in P} 2^{m-2i}\,p_m + \tfrac{1}{2}\pi(u) + \tfrac{1}{2}\pi(x_a) \;\Rightarrow\; \pi(u) = \sum_{i \in P} 2^{m-2i}\,p_m + \pi(x_a) .$$

A little manipulation, and recalling that $p_m = q_m = c/2$, gives

$$\pi(u) = c' + \frac{c}{2}\sum_{i \in P} 2^{m-2i}$$

for some constant $c'$.

Now we just need to check that one of $u, v$ is of maximal rank. The edge from $x_a$ to $u$ ensures that $u$ has higher rank than $u'$, and the self-loops at $u, u', v, v'$ ensure that $u$ has rank at least as high as $x_a$, since $\pi(x_a) = \frac{c}{2}(2^m - 1) < \pi(u)$ (and similarly for the other half of the network). By breaking ties in favour of $u, v$, one of them has maximal rank.

Define $\bar P = \{\, j \in P \mid (|P| - j + 1) \in P \,\}$. Since one of $u, v$ is of maximal rank, Lemma 3 implies that node $u$ is of maximal rank iff $\bar P > \bar Q$ (wrt their binary expansions).

Now if some node knows that $u$ is of maximal rank then $\bar P > \bar Q$, otherwise $v$ has maximal rank and so $\bar P < \bar Q$. As shown in the Figure, we can increase the separation factor on the cycle (between successive $x_i$'s) to $O(n)$, and so if Alice knows whether $v$ is of maximal rank, she can solve greater-than on sets of size $O(n \log n)$ bits. □

By the deterministic communication complexity of 
greater-than we have the following corollary. 

Corollary 1 Consider any deterministic algorithm that terminates with at least one node knowing if it is of maximal rank. Then at least $\Omega(n \log n)$ bits must be sent over $\Omega(n)$ edges in the network, in the worst case.

For randomized algorithms, the situation is somewhat different. The randomized complexity of greater-than is $\Omega(\log n)$.

Corollary 2 Consider any algorithm that terminates with at least one node knowing if it is of maximal rank, with probability at least 2/3. Then at least $\Omega(\log n + \log\log n) = \Omega(\log n)$ bits must be sent over $\Omega(n)$ edges in the network, in the worst case.



An interesting open problem is to find a distributed 
deterministic algorithm for computing the maximal 
node of a Markov chain. 



7 Discussion 

We have presented several lower bounds for an interesting problem in distributed computing, where the structure of the communication network is the input to the function to be computed. Our technique is to embed an instance of some two-party version of the problem into a network and, by appealing to the linear array conjecture, to lift the two-party result onto a result concerning the total communication of a distributed algorithm. We discussed that strengthening our results is likely to require a different lifting technique, as the linear array lifting only lets us account for the flow of data across a linear number of edges in $n$, even though there may be $\Theta(n^2)$ edges present. Finding a better lifting technique appears to be a general problem in proving good lower bounds for distributed computing problems.

In considering worst-case complexity we have ne- 
glected the graph-theoretic properties of G - it would 
be useful to know how the communication complex- 
ity of the problems is altered by restricting G to say, 
graphs of high conductance (in particular, this would 
seem to reduce the effectiveness of the lifting tech- 
nique because a graph of high conductance could not 
contain two dense graphs separated by a long string of 
edges, as this would then resemble a 'barbell graph'). 

Some of our lower bound reductions involving 
greater-than suggest that randomization may help. 
So far, we have been unable to confirm this but it 
would certainly seem natural, given the random walk 
interpretation of the problem.